Hyperparameter Selection under Localized Label Noise via Corrupt Validation

Authors

  • David I. Inouye
  • Pradeep Ravikumar
  • Pradipto Das
  • Ankur Datta
Abstract

Existing research on label noise often focuses on simple uniform or class-conditional noise. In many real-world settings, however, label noise is somewhat systematic rather than completely random. We therefore first propose a novel label noise model called Localized Label Noise (LLN) that corrupts labels in small local regions and is significantly more general than either uniform or class-conditional label noise. LLN is based on a k-nearest-neighbors corruption algorithm that corrupts all neighbors to the same wrong label and reduces to class-conditional label noise when k = 1. Given this more powerful model of label noise, we propose an empirical hyperparameter selection method under LLN that selects better hyperparameters than traditional strategies, such as cross validation, by synthetically corrupting the training labels while leaving the test labels unmodified. This method provides an approximate but more robust validation signal for hyperparameter selection. We design several label corruption experiments on both synthetic and real-world data to demonstrate that our proposed hyperparameter selection yields better estimates than standard methods.
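The abstract's corruption scheme can be sketched in a few lines: repeatedly pick a seed point, flip it and its k nearest neighbors to one common wrong label, and stop once a target fraction of labels has been touched. This is a minimal illustration based only on the abstract's description, not the paper's exact algorithm; the function name and parameters are illustrative.

```python
import numpy as np

def corrupt_labels_lln(X, y, noise_frac=0.2, k=5, rng=None):
    """Sketch of Localized Label Noise (LLN), per the abstract:
    corrupt small local regions by assigning a k-NN neighborhood
    one shared wrong label. With k=1 this degenerates to flipping
    individual points, akin to class-conditional noise."""
    rng = np.random.default_rng(rng)
    y = y.copy()
    n = len(y)
    classes = np.unique(y)
    corrupted = np.zeros(n, dtype=bool)
    target = int(noise_frac * n)
    while corrupted.sum() < target:
        seed = rng.integers(n)
        # k nearest neighbors of the seed (including the seed itself)
        dists = np.linalg.norm(X - X[seed], axis=1)
        neigh = np.argsort(dists)[:k]
        # one wrong label shared by the whole local region
        wrong = rng.choice(classes[classes != y[seed]])
        y[neigh] = wrong
        corrupted[neigh] = True
    return y
```

The corrupted training labels would then drive hyperparameter selection (e.g., picking the model that stays accurate on clean held-out labels despite the injected noise), in the spirit of the corrupt-validation procedure the paper proposes.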


Similar references

Semiparametric Localized Bandwidth Selection in Kernel Density Estimation

Since conventional cross-validation bandwidth selection methods do not work for the case where the data considered are serially dependent, alternative bandwidth selection methods are needed. In recent years, Bayesian based global bandwidth selection methods have been proposed. Our experience shows that the use of a global bandwidth is however less suitable than using a localized bandwidth in ke...


Objective selection of hyperparameter for EIT.

An algorithm for objectively calculating the hyperparameter for linearized one-step electrical impedance tomography (EIT) image reconstruction algorithms is proposed and compared to existing strategies. EIT is an ill-conditioned problem in which regularization is used to calculate a stable and accurate solution by incorporating some form of prior knowledge into the solution. A hyperparameter is...


Continuous Regularization Hyperparameters

Hyperparameter selection generally relies on running multiple full training trials, with hyperparameter selection based on validation set performance. We propose a gradient-based approach for locally adjusting hyperparameters during training of the model. Hyperparameters are adjusted so as to make the model parameter gradients, and hence updates, more advantageous for the validation cost. We ex...


The Power of Localization for Efficiently Learning Linear Separators with Noise

We introduce a new approach for designing computationally efficient learning algorithms that are tolerant to noise, and demonstrate its effectiveness by designing algorithms with improved noise tolerance guarantees for learning linear separators. We consider both the malicious noise model of Valiant [Valiant 1985; Kearns and Li 1988] and the adversarial label noise model of Kearns, Schapire, an...


Outlier Robust Gaussian Process Classification

Gaussian process classifiers (GPCs) are a fully statistical model for kernel classification. We present a form of GPC which is robust to labeling errors in the data set. This model allows label noise not only near the class boundaries, but also far from the class boundaries which can result from mistakes in labelling or gross errors in measuring the input features. We derive an outlier robust a...



Journal title:

Volume   Issue

Pages  -

Publication date: 2017